ABSTRACT


The paper proposes a new architecture for autonomously generating and
managing movement plans of planetary rovers. The system utilizes a
uniform representation of the instantaneous subgoals in the form of
virtual sensor states and the autonomous generation of a
subsumption-type plan network. These are expected to provide the
capability to pursue the overall goal while efficiently managing
various unpredicted anomalies in a partially unknown, ill-structured
environment such as a planetary surface.

INTRODUCTION 

Among the autonomous functions required for future unmanned planetary
rovers, one of the most important will be the capability to generate
and manage various movement plans under partially unknown,
ill-structured environments. For example, path planning will be based
on maps of the planet obtained beforehand by observation from
planetary orbit, but these maps will not be very accurate, and in many
cases there will be many obstacles (such as small rocks or gaps) not
represented on them. The path planning system must therefore be
flexible enough to compensate for the inaccuracy of the maps and to
respond quickly to unpredicted events such as collisions with
obstacles.

The paper proposes a novel architecture for autonomously generating
and managing such movement plans of planetary rovers. The architecture
is basically similar to the well-known subsumption architecture
(Fig. 1) [1] in the sense that the finally obtained movement plans are
represented in the form of a hierarchical suppression/promotion
network of primitive reflex actions such as “moving towards a
prescribed point”, “wandering about”, “moving in the reverse direction
when a certain touch sensor senses an obstacle”, and so on. This
representation of plans is, as has been widely discussed in the
literature, superior in 1) robustness in the actual world, because no
“symbolic world model” is utilized, 2) real-time performance, because
no complicated symbolic manipulation is required, and 3) ease of
system integration and extension, because bottom-up system
construction is quite easy. For this reason, this architecture is well
suited as a plan representation scheme for rovers which move in an
unstructured world. Its most significant departure from the
conventional system concept is that the goal of the plan is not
represented explicitly, but is achieved in the course of the
interactions between the network of reflex actions (called “RAN”
hereafter) and the environment. This feature is called “emergent
functionality”.

Figure 1. Subsumption Architecture



Figure 2. Schematic View of the Example Rover 


This architecture, however, has some difficult problems to be solved
before actual use: 1) the RAN must be carefully designed by human
designers so that the emergent functionality achieves the given goal,
which is a far more difficult task than building a system which deals
with the goal explicitly, and 2) once coded, the network is fixed
during actual operations, so changes of the environment or of the
system itself cannot be dealt with. Because of these shortcomings, the
subsumption architecture cannot be employed in its original form for
our objectives.

We modified and enhanced the subsumption architecture in the following
three points: 1) a uniform representation of the instantaneous
subgoals is introduced in the form of virtual sensors so that the goal
can be pursued more explicitly, 2) the RAN is automatically generated
by compiling the database of the actions' behavior networks obtained
by machine learning, and 3) the RAN is modified during actual
operations to cope with changes of the system and environment. The
resultant system is expected to have the capability to pursue the
overall goal while managing, efficiently and more flexibly, various
unpredicted anomalies in a partially unknown, ill-structured
environment such as a planetary surface.

In the following explanation, we assume an example task: to fetch a
certain object which is placed at a certain position (not at the rover
position) and to carry it to a prescribed goal position. The rover is
assumed to have four touch sensors (each sensitive to force in two
directions) and one camera, and to be able to turn right/left and move
forward/backward, as illustrated in Fig. 2. It is also assumed that
the rover knows its current position and orientation.

NEW ARCHITECTURE 

Virtual Sensor States 

Various actions are uniformly represented in the form of changes of
sensor outputs. In order for high-level tasks such as “Plan Path” or
“Write Obstacle Position to Map” to be represented in the same way,
states such as “whether the map is updated or not” or “whether there
are no obstacles between the current target and the rover position”
are also represented as “virtual” sensor states. For the example task,
the eight sensor states shown in Fig. 3 (including three virtual
sensor states) are employed; they are called X1 ~ X8. The goal state
for the example problem can be represented as (* 0 * 0 * 0 * *)^T,
where “*” denotes an arbitrary value.

State  Example  Content
X1     35       Head angle from the goal direction (0 - 360 deg)
X2     10       Distance from the goal
X3     0        Head angle from the object direction (0 - 360 deg)
X4     0        Distance from the object
X5     3        Touch sensor output (0 - 8; 2 directions x 4 sensors, 0 for no touch)
X6     1        Object carried? (0 for Yes and 1 for No)
X7     1        No obstacles between target and current position? (0 for No and 1 for Yes)
X8     1        Map updated? (0 for Yes and 1 for No)

Figure 3. Content of Sensor States
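The sensor state vector and the wildcard goal pattern can be illustrated with a minimal Python sketch. The encoding is an assumption made only for illustration: `None` stands for the “*” wildcard, and the names `GOAL`, `EXAMPLE_STATE`, `matches` and `unmet` are hypothetical and do not come from the paper.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed encoding: None plays the role of the "*" wildcard.
Pattern = Tuple[Optional[int], ...]

# Goal pattern from the text: X2 = 0, X4 = 0, X6 = 0, all other states arbitrary.
GOAL: Pattern = (None, 0, None, 0, None, 0, None, None)

# Example sensor state X1..X8 as listed in Figure 3.
EXAMPLE_STATE = (35, 10, 0, 0, 3, 1, 1, 1)


def matches(state: Sequence[int], pattern: Pattern) -> bool:
    """Return True if every non-wildcard entry of the pattern equals the state."""
    if len(state) != len(pattern):
        raise ValueError("state and pattern must have the same length")
    return all(p is None or s == p for s, p in zip(state, pattern))


def unmet(state: Sequence[int], pattern: Pattern) -> List[int]:
    """Return the 1-based indices of states that still differ from the pattern."""
    return [
        i + 1
        for i, (s, p) in enumerate(zip(state, pattern))
        if p is not None and s != p
    ]
```

For the example state of Figure 3, `unmet(EXAMPLE_STATE, GOAL)` lists X2 and X6 as the states that still have to be driven to zero.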

Learning of Behavior Network 

The plan management system learns when a certain action can be applied
and how the action changes the sensor state. During the learning
phase, the rover chooses actions randomly and continues each one until
at least one of the sensor states changes. A change of the sensor
state is defined as follows: for the discrete-valued states (such as
X5 ~ X8), any change of the value; for the continuous-valued states
(the other states), a transition of the value between positive,
negative and zero. Examples are shown as the leftmost state
transitions of Fig. 4. These transitions are translated into a more
abstract form of state transition (the middle forms of Fig. 4) and
stored in the database. In this figure, “*” (wild card) means an
arbitrary value, “>” means a positive value, and “**” means that the
value has not changed from the one before the action was taken.
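The abstraction of a raw transition described above can be sketched as follows. This is only an illustration under stated assumptions: X1 to X4 are treated as the continuous-valued states, the symbol “<” for negative values is assumed as the counterpart of “>”, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple, Union

Symbol = Union[int, str]

# Assumption: X1..X4 (zero-based indices 0..3) are the continuous-valued states.
CONTINUOUS = frozenset({0, 1, 2, 3})
UNCHANGED = "**"  # value not changed by the action


def abstract_value(index: int, value: float) -> Symbol:
    """Map a raw sensor value to its abstract symbol."""
    if index in CONTINUOUS:
        if value > 0:
            return ">"
        if value < 0:
            return "<"  # assumed symbol for negative values
        return 0
    return int(value)  # discrete states keep their value


def abstract_state(state: Sequence[float]) -> List[Symbol]:
    """Abstract every entry of a raw sensor state."""
    return [abstract_value(i, v) for i, v in enumerate(state)]


def state_changed(before: Sequence[float], after: Sequence[float]) -> bool:
    """True if at least one state changed in the sense defined in the text."""
    return abstract_state(before) != abstract_state(after)


def abstract_transition(
    before: Sequence[float], after: Sequence[float]
) -> Tuple[List[Symbol], List[Symbol]]:
    """Return (precondition, consequence); unchanged entries become '**'."""
    pre = abstract_state(before)
    post = [UNCHANGED if p == q else q for p, q in zip(pre, abstract_state(after))]
    return pre, post
```

During random exploration, `state_changed` would serve as the stopping test for an action, and each `(precondition, consequence)` pair returned by `abstract_transition` would be one record stored in the database.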


After a large amount of such data has been accumulated for each
action, a conventional inductive learning algorithm is applied to
yield a generalized form of the state transition of the action (such
as the rightmost form of Fig. 4). The generalization rules employed
include the “turning a constant into a variable” rule and the
“constraint deletion” rule. If the generalization between the current
representation and a new instance would result in a trivial state
transition (such as one in which all the states are represented as *),
a disjunctive generalization is introduced. Finally, several
disjunctive representations are obtained for each action. These state
transitions are called “Behavior Networks” in this paper.

Figure 4. Acquisition and Generalization of Behavior Network

Higher-level actions such as path planning also have behavior
networks. As these networks are hard to learn and can easily be
defined beforehand, they are specified by the system designer.
Anomalous events during actual movement, such as collisions with
obstacles, are also defined as state transitions.

Compilation of Behavior Networks

After the behavior networks of all the actions become mature, they are
compiled into a subsumption-type plan network. The major tasks of this
compilation are the identification of the sensor stimuli that fire
each action and the extraction of priority relationships between the
actions. The following rules are observed in constructing the plan
network.

(1) Actions are defined in the form of “continue action A1 until Xi
becomes a certain constant c.” Therefore, for the “turn right” action,
several variations are generated, such as “turn right until the head
angle from the goal direction becomes zero” or “turn right until the
touch sensors sense no force”, and so on.

(2) The actions whose consequences match 
the goal state are considered as candidates of 
the lowest level of the plan network. 

(3) If taking a certain action (say A1) requires that a certain state
have a certain value (0 or another integer), then an action (say A2)
whose consequences satisfy this precondition is categorized as a
candidate for an action which must be performed before A1 (in other
words, whose firing suppresses the activation of A1). The
preconditions of A1 which are not explicitly satisfied by A2 are
registered as the stimuli for firing A1. Then all the consequence
states of A2 are matched with the precondition states of A1, and the
precondition states of A2 are replaced with the values obtained by
this matching.

(4) Many hierarchical relationships will be acquired in the above
processes. From these, the best plan network is obtained by searching
the space of all the combinations, based on the following criteria:

- The network does not have any loops. 

- The network can lead the system to the 
goal state from arbitrary states. 
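Parts of rules (2) to (4) can be sketched in a few lines of Python. This is a simplified illustration, not the compilation procedure of the paper: it only collects goal-level candidates, derives “must be performed before” edges by matching consequences against preconditions, and checks the no-loop criterion. The `Behavior` type, the dictionary encoding of states and all function names are assumptions; the search over combinations and the reachability criterion are not implemented.

```python
from typing import Dict, List, NamedTuple, Sequence, Set, Tuple


class Behavior(NamedTuple):
    name: str
    pre: Dict[str, int]   # state values required before the action
    post: Dict[str, int]  # state values established by the action


def goal_candidates(behaviors: Sequence[Behavior], goal: Dict[str, int]) -> List[str]:
    """Rule (2): actions with a consequence that matches the goal state."""
    return [
        b.name for b in behaviors
        if any(goal.get(k) == v for k, v in b.post.items())
    ]


def precedence_edges(behaviors: Sequence[Behavior]) -> Set[Tuple[str, str]]:
    """Rule (3): (A2, A1) if a consequence of A2 satisfies a precondition of A1."""
    edges: Set[Tuple[str, str]] = set()
    for a1 in behaviors:
        for a2 in behaviors:
            if a1 is a2:
                continue
            if any(a2.post.get(k) == v for k, v in a1.pre.items()):
                edges.add((a2.name, a1.name))
    return edges


def has_loop(edges: Set[Tuple[str, str]]) -> bool:
    """First criterion of rule (4): detect a cycle by depth-first search."""
    graph: Dict[str, Set[str]] = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    mark: Dict[str, str] = {}  # node -> "visiting" or "done"

    def visit(node: str) -> bool:
        if mark.get(node) == "done":
            return False
        if mark.get(node) == "visiting":
            return True  # back edge: the network contains a loop
        mark[node] = "visiting"
        if any(visit(nxt) for nxt in graph[node]):
            return True
        mark[node] = "done"
        return False

    return any(visit(node) for node in graph)
```

A candidate network would be rejected as soon as `has_loop` returns `True`; checking that the goal can be reached from arbitrary states would need an additional search that is left out here.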

Figure 5 shows the plan network obtained for the example problem. In
this figure, the wavy line shows the “Suppression Signal.”

< Stimuli >          < Action (stopping condition) >

X8=1                 WO(X8=0)
X7=1                 PP(X7=0)
X5=4 or 6            TR(X5=0)
X5=2 or 8            TL(X5=0)
X5=1 or 3            MB(X5=0)
X5=5 or 7            MF(X5=0)
X6=1 and X3>0        TR(X3=0)
X6=1 and X3<0        TL(X3=0)
X6=1 and X4>0        MF(X4=0)
X1>0                 TR(X1=0)
X1<0                 TL(X1=0)
X2>0                 MF(X2=0)

WO: Write Obstacle to the Map    PP: Plan Path to the Goal
TR: Turn Right                   TL: Turn Left
MB: Move Backward                MF: Move Forward

Figure 5. Obtained Plan Network for the Example Problem

For example, the action “MF(X4=0)” (which stands for “move forward
until X4=0”) must be performed preferentially if X4=0 is not satisfied
when trying to start action “TR(X1=0)” or “TL(X1=0)”. When trying to
start “MF(X4=0)”, if X3=0 is not satisfied, then the action “TR(X3=0)”
or “TL(X3=0)” is performed according to the sign of X3. In this way,
the plan network takes into account both the priority relationships
between actions and anomaly handling (such as backing away from an
obstacle when a touch sensor detects it). For example, if the rover
collides with an obstacle during a certain action (say A1), X7 and X8
become 1. This first triggers the action “write obstacle position to
the map (WO)” to change X8 to 0, and then triggers “plan path (PP)” to
change X7 to 0. The system then resumes A1 and, unless another action
with higher priority is triggered, continues it. Note that, as a side
effect of the WO and PP actions, the states X1 ~ X4 will change.
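The priority behavior described above can be approximated by a fixed-priority rule list in which the first rule whose stimulus holds suppresses all rules below it. This is a simplified sketch: the stimuli follow one reading of Figure 5, the touch-sensor reflexes on X5 are omitted for brevity, and the flat ordering of `RULES` as well as the name `select_action` are assumptions rather than the network wiring of the paper.

```python
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, int]
Rule = Tuple[str, Callable[[State], bool]]

# Assumed ordering, highest priority first; a firing rule suppresses those below.
RULES: List[Rule] = [
    ("WO(X8=0)", lambda s: s["X8"] == 1),
    ("PP(X7=0)", lambda s: s["X7"] == 1),
    ("TR(X3=0)", lambda s: s["X6"] == 1 and s["X3"] > 0),
    ("TL(X3=0)", lambda s: s["X6"] == 1 and s["X3"] < 0),
    ("MF(X4=0)", lambda s: s["X6"] == 1 and s["X4"] > 0),
    ("TR(X1=0)", lambda s: s["X1"] > 0),
    ("TL(X1=0)", lambda s: s["X1"] < 0),
    ("MF(X2=0)", lambda s: s["X2"] > 0),
]


def select_action(state: State, rules: List[Rule] = RULES) -> Optional[str]:
    """Return the highest-priority action whose stimulus holds, or None."""
    for name, stimulus in rules:
        if stimulus(state):
            return name
    return None
```

With X7 and X8 both set to 1 after a collision, `select_action` returns the WO action first; once X8 is 0 it returns PP, and once X7 is 0 control falls back to the ordinary movement rules, which mirrors the resumption of A1 described in the text.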

If the consequence of a certain action is found 
inconsistent with the learned behavior network, 
then the learning of the correct behavior net- 
work is re-initiated for the specific action, which 
also triggers the recompilation of the behav- 
ior networks into the plan network. With this 
technique, the system has the flexibility to adapt 
itself to the change of the environment or the 
system itself. 
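The consistency check that triggers re-learning can be sketched as follows. This is a hypothetical illustration: the `PlanMonitor` class, its attributes, and the use of `None` as the “*” wildcard are assumptions, and the learning and compilation steps themselves are only flagged, not performed.

```python
from typing import Dict, Optional, Sequence, Set, Tuple

Pattern = Tuple[Optional[int], ...]  # None = "*" wildcard (assumed encoding)


class PlanMonitor:
    """Flags actions whose observed consequence contradicts the learned one."""

    def __init__(self, learned: Dict[str, Pattern]) -> None:
        self.learned = learned              # action name -> expected consequence
        self.to_relearn: Set[str] = set()   # actions queued for re-learning
        self.recompile_needed = False       # plan network must be rebuilt

    def observe(self, action: str, after: Sequence[int]) -> bool:
        """Compare the state after an action with its learned consequence."""
        expected = self.learned[action]
        consistent = all(e is None or e == a for e, a in zip(expected, after))
        if not consistent:
            self.to_relearn.add(action)
            self.recompile_needed = True
        return consistent
```

An inconsistent observation leaves the action name in `to_relearn` and sets `recompile_needed`, which corresponds to re-initiating learning for that action and recompiling the plan network.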

CONCLUSIONS 

An architecture to manage rover movement plans under ill-structured,
partially unknown environments has been proposed. Simulation studies
have indicated the effectiveness of the architecture, and experiments
using an actual rover-type vehicle are now being performed.